Overview

Dataset statistics

Number of variables: 26
Number of observations: 42370
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.1 MiB
Average record size in memory: 126.1 B

Variable types

Numeric: 13
Categorical: 13

Warnings

zip_code has a high cardinality: 835 distinct values [High cardinality]
loan is highly correlated with installment [High correlation]
installment is highly correlated with loan [High correlation]
annual_inc is highly skewed (γ1 = 29.14005363) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
inq_last_6mths has 19599 (46.3%) zeros [Zeros]
delinq_ago_mths has 27618 (65.2%) zeros [Zeros]
revol_util has 1068 (2.5%) zeros [Zeros]
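
The zero percentages in these warnings can be checked against the number of observations. The sketch below is only an illustration: the helper name `zero_share` is made up for this example and is not part of pandas-profiling.

```python
def zero_share(zero_count: int, total: int) -> float:
    """Return the share of zero-valued observations as a percentage, rounded to one decimal."""
    return round(100 * zero_count / total, 1)


N_OBSERVATIONS = 42370  # "Number of observations" from the dataset statistics

# Zero counts as listed in the warnings.
zero_counts = {"inq_last_6mths": 19599, "delinq_ago_mths": 27618, "revol_util": 1068}

for name, zeros in zero_counts.items():
    print(f"{name}: {zero_share(zeros, N_OBSERVATIONS)}% zeros")
```

The same division applies to any count/percentage pair in the report, since every variable here has 42370 non-missing rows.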

Reproduction

Analysis started: 2021-03-24 08:33:40.562871
Analysis finished: 2021-03-24 08:34:01.700359
Duration: 21.14 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
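
The reported duration is the difference between the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2021-03-24 08:33:40.562871", FMT)
finished = datetime.strptime("2021-03-24 08:34:01.700359", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```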

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 42370
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21203.13495
Minimum: 0
Maximum: 42478
Zeros: 1
Zeros (%): < 0.1%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2118.45
Q1: 10595.25
median: 21193.5
Q3: 31800.75
95-th percentile: 40340.55
Maximum: 42478
Range: 42478
Interquartile range (IQR): 21205.5
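
Range and interquartile range follow directly from the other quantile statistics. The sketch below recomputes both for df_index; the function name `spread` is an assumption made for this example.

```python
def spread(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Derive range and interquartile range from four quantile statistics."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Quantile statistics reported for df_index.
df_index_spread = spread(minimum=0, q1=10595.25, q3=31800.75, maximum=42478)
print(df_index_spread)
```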

Descriptive statistics

Standard deviation: 12249.90284
Coefficient of variation (CV): 0.5777401724
Kurtosis: -1.198342957
Mean: 21203.13495
Median Absolute Deviation (MAD): 10603
Skewness: 0.002283855245
Sum: 898376828
Variance: 150060119.7
Monotonicity: Strictly increasing
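
Several of these descriptive statistics are related: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks both for df_index and shows one way a "strictly increasing" test could be written. The helper names are assumptions, not the report generator's own code.

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """CV = standard deviation / mean."""
    return std / mean


def is_strictly_increasing(values) -> bool:
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


STD, MEAN = 12249.90284, 21203.13495  # values reported for df_index

cv = coefficient_of_variation(STD, MEAN)
variance = STD ** 2  # matches the reported variance up to rounding of STD

print(round(cv, 4), round(variance))
print(is_strictly_increasing([0, 1, 2, 3, 4]))  # first five df_index values
```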
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
0                       1       < 0.1%
10928                   1       < 0.1%
15026                   1       < 0.1%
12979                   1       < 0.1%
2740                    1       < 0.1%
693                     1       < 0.1%
6838                    1       < 0.1%
4791                    1       < 0.1%
27320                   1       < 0.1%
25273                   1       < 0.1%
Other values (42360)    42360   > 99.9%

Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%

Value   Count   Frequency (%)
42478   1       < 0.1%
42477   1       < 0.1%
42476   1       < 0.1%
42474   1       < 0.1%
42472   1       < 0.1%

loan
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1051
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10843.87538
Minimum: 500
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 500
5-th percentile: 2311.25
Q1: 5100
median: 9600
Q3: 15000
95-th percentile: 25000
Maximum: 35000
Range: 34500
Interquartile range (IQR): 9900

Descriptive statistics

Standard deviation: 7147.489138
Coefficient of variation (CV): 0.6591268237
Kurtosis: 0.9434017677
Mean: 10843.87538
Median Absolute Deviation (MAD): 4600
Skewness: 1.083558714
Sum: 459455000
Variance: 51086600.98
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
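
A histogram with fixed size bins splits the interval between the minimum and the maximum into equally wide bins and counts the values in each. The following is a minimal pure-Python sketch of that idea, not the plotting code used by the report; the function name and the sample loan amounts are hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values identical, put everything in the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical loan amounts, chosen only to illustrate the binning.
sample_loans = [500, 5000, 10000, 10000, 12000, 35000]
print(fixed_width_histogram(sample_loans, bins=5))
```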
Value                  Count   Frequency (%)
10000                  2919    6.9%
12000                  2345    5.5%
5000                   2216    5.2%
6000                   2020    4.8%
15000                  1896    4.5%
8000                   1681    4.0%
20000                  1543    3.6%
4000                   1224    2.9%
25000                  1222    2.9%
3000                   1114    2.6%
Other values (1041)    24190   57.1%

Value   Count   Frequency (%)
500     11      < 0.1%
550     1       < 0.1%
600     6       < 0.1%
700     2       < 0.1%
725     1       < 0.1%

Value   Count   Frequency (%)
35000   559     1.3%
34800   1       < 0.1%
34675   2       < 0.1%
34525   1       < 0.1%
34475   4       < 0.1%

term
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.6 KiB
36 months: 31373
60 months: 10997

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 423700
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
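
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one term value. It assumes the stored value carries a leading space (" 36 months"), which is inferred from the reported length of 10 and is not stated explicitly in the report.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Ll', 'Nd', 'Zs') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# Assumed raw value with a leading space, giving 10 characters in total.
sample = " 36 months"
counts = category_counts(sample)
print(dict(counts))
```

Here `Ll` is Lowercase Letter, `Nd` is Decimal Number and `Zs` is Space Separator, the three categories the report lists for this variable.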

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 36 months
2nd row: 60 months
3rd row: 36 months
4th row: 36 months
5th row: 60 months
Value       Count   Frequency (%)
36 months   31373   74.0%
60 months   10997   26.0%
Histogram of lengths of the category
ValueCountFrequency (%)
months42370
50.0%
3631373
37.0%
6010997
 
13.0%

Most occurring characters

ValueCountFrequency (%)
84740
20.0%
642370
10.0%
m42370
10.0%
o42370
10.0%
n42370
10.0%
t42370
10.0%
h42370
10.0%
s42370
10.0%
331373
 
7.4%
010997
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter254220
60.0%
Space Separator84740
 
20.0%
Decimal Number84740
 
20.0%

Most frequent character per category

ValueCountFrequency (%)
m42370
16.7%
o42370
16.7%
n42370
16.7%
t42370
16.7%
h42370
16.7%
s42370
16.7%
ValueCountFrequency (%)
642370
50.0%
331373
37.0%
010997
 
13.0%
ValueCountFrequency (%)
84740
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin254220
60.0%
Common169480
40.0%

Most frequent character per script

ValueCountFrequency (%)
m42370
16.7%
o42370
16.7%
n42370
16.7%
t42370
16.7%
h42370
16.7%
s42370
16.7%
ValueCountFrequency (%)
84740
50.0%
642370
25.0%
331373
 
18.5%
010997
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII423700
100.0%

Most frequent character per block

ValueCountFrequency (%)
84740
20.0%
642370
10.0%
m42370
10.0%
o42370
10.0%
n42370
10.0%
t42370
10.0%
h42370
10.0%
s42370
10.0%
331373
 
7.4%
010997
 
2.6%

int_rate
Real number (ℝ≥0)

Distinct: 394
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.164367
Minimum: 5.42
Maximum: 24.59
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 5.42
5-th percentile: 6.54
Q1: 9.63
median: 11.99
Q3: 14.72
95-th percentile: 18.62
Maximum: 24.59
Range: 19.17
Interquartile range (IQR): 5.09

Descriptive statistics

Standard deviation: 3.708500335
Coefficient of variation (CV): 0.3048658704
Kurtosis: -0.4747035457
Mean: 12.164367
Median Absolute Deviation (MAD): 2.66
Skewness: 0.2383719897
Sum: 515404.23
Variance: 13.75297473
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
10.99                 970     2.3%
11.49                 837     2.0%
13.49                 832     2.0%
7.51                  787     1.9%
7.88                  742     1.8%
7.49                  656     1.5%
11.71                 609     1.4%
9.99                  607     1.4%
7.9                   582     1.4%
5.42                  573     1.4%
Other values (384)    35175   83.0%

Value   Count   Frequency (%)
5.42    573     1.4%
5.79    410     1.0%
5.99    347     0.8%
6       19      < 0.1%
6.03    447     1.1%

Value   Count   Frequency (%)
24.59   1       < 0.1%
24.4    1       < 0.1%
24.11   3       < 0.1%
23.91   11      < 0.1%
23.59   4       < 0.1%

installment
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 16381
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 323.2224836
Minimum: 15.67
Maximum: 1305.19
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 15.67
5-th percentile: 69.94
Q1: 165.82
median: 278.48
Q3: 429.18
95-th percentile: 763.2625
Maximum: 1305.19
Range: 1289.52
Interquartile range (IQR): 263.36

Descriptive statistics

Standard deviation: 208.9468873
Coefficient of variation (CV): 0.6464491114
Kurtosis: 1.200234886
Mean: 323.2224836
Median Absolute Deviation (MAD): 122.72
Skewness: 1.123163223
Sum: 13694936.63
Variance: 43658.80171
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
311.11                  68      0.2%
180.96                  59      0.1%
311.02                  54      0.1%
150.8                   48      0.1%
368.45                  46      0.1%
372.12                  45      0.1%
330.76                  43      0.1%
317.72                  42      0.1%
339.31                  42      0.1%
186.61                  41      0.1%
Other values (16371)    41882   98.8%

Value   Count   Frequency (%)
15.67   1       < 0.1%
15.69   1       < 0.1%
15.75   1       < 0.1%
15.76   1       < 0.1%
15.91   1       < 0.1%

Value     Count   Frequency (%)
1305.19   1       < 0.1%
1302.69   1       < 0.1%
1295.21   1       < 0.1%
1288.1    2       < 0.1%
1283.5    1       < 0.1%

grade
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.8 KiB
B: 12364
A: 10152
C: 8704
D: 5985
E: 3370
Other values (2): 1795

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 42370
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: C
3rd row: C
4th row: C
5th row: B
Value   Count   Frequency (%)
B       12364   29.2%
A       10152   24.0%
C       8704    20.5%
D       5985    14.1%
E       3370    8.0%
F       1288    3.0%
G       507     1.2%
Histogram of lengths of the category
ValueCountFrequency (%)
b12364
29.2%
a10152
24.0%
c8704
20.5%
d5985
14.1%
e3370
 
8.0%
f1288
 
3.0%
g507
 
1.2%

Most occurring characters

ValueCountFrequency (%)
B12364
29.2%
A10152
24.0%
C8704
20.5%
D5985
14.1%
E3370
 
8.0%
F1288
 
3.0%
G507
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter42370
100.0%

Most frequent character per category

ValueCountFrequency (%)
B12364
29.2%
A10152
24.0%
C8704
20.5%
D5985
14.1%
E3370
 
8.0%
F1288
 
3.0%
G507
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Latin42370
100.0%

Most frequent character per script

ValueCountFrequency (%)
B12364
29.2%
A10152
24.0%
C8704
20.5%
D5985
14.1%
E3370
 
8.0%
F1288
 
3.0%
G507
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII42370
100.0%

Most frequent character per block

ValueCountFrequency (%)
B12364
29.2%
A10152
24.0%
C8704
20.5%
D5985
14.1%
E3370
 
8.0%
F1288
 
3.0%
G507
 
1.2%

empl_years
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.9 KiB
10+ years: 9357
< 1 year: 5005
2 years: 4727
3 years: 4354
4 years: 3635
Other values (7): 15292

Length

Max length: 10
Median length: 7
Mean length: 7.55411848
Min length: 6

Characters and Unicode

Total characters: 320068
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10+ years
2nd row: < 1 year
3rd row: 10+ years
4th row: 10+ years
5th row: 1 year
Value               Count   Frequency (%)
10+ years           9357    22.1%
< 1 year            5005    11.8%
2 years             4727    11.2%
3 years             4354    10.3%
4 years             3635    8.6%
1 year              3568    8.4%
5 years             3452    8.1%
6 years             2370    5.6%
7 years             1869    4.4%
8 years             1587    3.7%
Other values (2)    2446    5.8%
Histogram of lengths of the category
ValueCountFrequency (%)
years32688
36.9%
109357
 
10.6%
18573
 
9.7%
year8573
 
9.7%
5005
 
5.6%
24727
 
5.3%
34354
 
4.9%
43635
 
4.1%
53452
 
3.9%
62370
 
2.7%
Other values (4)5902
 
6.7%

Most occurring characters

ValueCountFrequency (%)
46266
14.5%
e43479
13.6%
y42370
13.2%
a41261
12.9%
r41261
12.9%
s32688
10.2%
117930
 
5.6%
09357
 
2.9%
+9357
 
2.9%
<5005
 
1.6%
Other values (15)31094
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter208822
65.2%
Decimal Number50618
 
15.8%
Space Separator46266
 
14.5%
Math Symbol14362
 
4.5%

Most frequent character per category

ValueCountFrequency (%)
e43479
20.8%
y42370
20.3%
a41261
19.8%
r41261
19.8%
s32688
15.7%
u1109
 
0.5%
n1109
 
0.5%
m1109
 
0.5%
p1109
 
0.5%
l1109
 
0.5%
Other values (2)2218
 
1.1%
ValueCountFrequency (%)
117930
35.4%
09357
18.5%
24727
 
9.3%
34354
 
8.6%
43635
 
7.2%
53452
 
6.8%
62370
 
4.7%
71869
 
3.7%
81587
 
3.1%
91337
 
2.6%
ValueCountFrequency (%)
+9357
65.2%
<5005
34.8%
ValueCountFrequency (%)
46266
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin208822
65.2%
Common111246
34.8%

Most frequent character per script

ValueCountFrequency (%)
46266
41.6%
117930
 
16.1%
09357
 
8.4%
+9357
 
8.4%
<5005
 
4.5%
24727
 
4.2%
34354
 
3.9%
43635
 
3.3%
53452
 
3.1%
62370
 
2.1%
Other values (3)4793
 
4.3%
ValueCountFrequency (%)
e43479
20.8%
y42370
20.3%
a41261
19.8%
r41261
19.8%
s32688
15.7%
u1109
 
0.5%
n1109
 
0.5%
m1109
 
0.5%
p1109
 
0.5%
l1109
 
0.5%
Other values (2)2218
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII320068
100.0%

Most frequent character per block

ValueCountFrequency (%)
46266
14.5%
e43479
13.6%
y42370
13.2%
a41261
12.9%
r41261
12.9%
s32688
10.2%
117930
 
5.6%
09357
 
2.9%
+9357
 
2.9%
<5005
 
1.6%
Other values (15)31094
9.7%

home_ownership
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.7 KiB
RENT: 20093
MORTGAGE: 18907
OWN: 3234
OTHER: 136

Length

Max length: 8
Median length: 4
Mean length: 5.711824404
Min length: 3

Characters and Unicode

Total characters: 242010
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RENT
2nd row: RENT
3rd row: RENT
4th row: RENT
5th row: RENT
Value      Count   Frequency (%)
RENT       20093   47.4%
MORTGAGE   18907   44.6%
OWN        3234    7.6%
OTHER      136     0.3%
Histogram of lengths of the category
ValueCountFrequency (%)
rent20093
47.4%
mortgage18907
44.6%
own3234
 
7.6%
other136
 
0.3%

Most occurring characters

ValueCountFrequency (%)
R39136
16.2%
E39136
16.2%
T39136
16.2%
G37814
15.6%
N23327
9.6%
O22277
9.2%
M18907
7.8%
A18907
7.8%
W3234
 
1.3%
H136
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter242010
100.0%

Most frequent character per category

ValueCountFrequency (%)
R39136
16.2%
E39136
16.2%
T39136
16.2%
G37814
15.6%
N23327
9.6%
O22277
9.2%
M18907
7.8%
A18907
7.8%
W3234
 
1.3%
H136
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin242010
100.0%

Most frequent character per script

ValueCountFrequency (%)
R39136
16.2%
E39136
16.2%
T39136
16.2%
G37814
15.6%
N23327
9.6%
O22277
9.2%
M18907
7.8%
A18907
7.8%
W3234
 
1.3%
H136
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII242010
100.0%

Most frequent character per block

ValueCountFrequency (%)
R39136
16.2%
E39136
16.2%
T39136
16.2%
G37814
15.6%
N23327
9.6%
O22277
9.2%
M18907
7.8%
A18907
7.8%
W3234
 
1.3%
H136
 
0.1%

annual_inc
Real number (ℝ≥0)

SKEWED

Distinct: 5587
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69162.58756
Minimum: 1896
Maximum: 6000000
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 1896
5-th percentile: 24000
Q1: 40000
median: 59000
Q3: 82500
95-th percentile: 144000
Maximum: 6000000
Range: 5998104
Interquartile range (IQR): 42500

Descriptive statistics

Standard deviation: 64092.73559
Coefficient of variation (CV): 0.9266966122
Kurtosis: 2125.792689
Mean: 69162.58756
Median Absolute Deviation (MAD): 20000
Skewness: 29.14005363
Sum: 2930418835
Variance: 4107878755
Monotonicity: Not monotonic
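
The very large skewness of annual_inc comes from a few extreme incomes far above the bulk of the data. The sketch below uses the simple moment-based skewness g1 = m3 / m2^1.5 on a small hypothetical sample to show how a single outlier drives the statistic and how a log transform dampens it. The estimator behind the report's γ1 may apply a sample-size correction, so this is not expected to reproduce 29.14, and the sample values are borrowed from the report's quantiles rather than from actual rows.

```python
import math


def moment_skewness(values):
    """Biased moment skewness g1 = m3 / m2**1.5 (no sample-size correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical sample: the reported quantiles plus the reported maximum.
incomes = [24000, 40000, 59000, 82500, 144000, 6000000]

raw_skew = moment_skewness(incomes)
log_skew = moment_skewness([math.log(v) for v in incomes])
print(f"raw: {raw_skew:.2f}, log-transformed: {log_skew:.2f}")
```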
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
60000                  1587    3.7%
50000                  1115    2.6%
40000                  930     2.2%
45000                  893     2.1%
30000                  880     2.1%
75000                  863     2.0%
65000                  838     2.0%
70000                  789     1.9%
48000                  763     1.8%
80000                  715     1.7%
Other values (5577)    32997   77.9%

Value   Count   Frequency (%)
1896    1       < 0.1%
2000    1       < 0.1%
3300    1       < 0.1%
3500    1       < 0.1%
3600    1       < 0.1%

Value     Count   Frequency (%)
6000000   1       < 0.1%
3900000   1       < 0.1%
2039784   1       < 0.1%
1900000   1       < 0.1%
1782000   1       < 0.1%

income_ver
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.6 KiB
Not Verified: 18616
Verified: 13460
Source Verified: 10294

Length

Max length: 15
Median length: 12
Mean length: 11.45815435
Min length: 8

Characters and Unicode

Total characters: 485482
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Verified
2nd row: Source Verified
3rd row: Not Verified
4th row: Source Verified
5th row: Source Verified
Value             Count   Frequency (%)
Not Verified      18616   43.9%
Verified          13460   31.8%
Source Verified   10294   24.3%
Histogram of lengths of the category
ValueCountFrequency (%)
verified42370
59.4%
not18616
26.1%
source10294
 
14.4%

Most occurring characters

ValueCountFrequency (%)
e95034
19.6%
i84740
17.5%
r52664
10.8%
V42370
8.7%
f42370
8.7%
d42370
8.7%
o28910
 
6.0%
28910
 
6.0%
N18616
 
3.8%
t18616
 
3.8%
Other values (3)30882
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter385292
79.4%
Uppercase Letter71280
 
14.7%
Space Separator28910
 
6.0%

Most frequent character per category

ValueCountFrequency (%)
e95034
24.7%
i84740
22.0%
r52664
13.7%
f42370
11.0%
d42370
11.0%
o28910
 
7.5%
t18616
 
4.8%
u10294
 
2.7%
c10294
 
2.7%
ValueCountFrequency (%)
V42370
59.4%
N18616
26.1%
S10294
 
14.4%
ValueCountFrequency (%)
28910
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin456572
94.0%
Common28910
 
6.0%

Most frequent character per script

ValueCountFrequency (%)
e95034
20.8%
i84740
18.6%
r52664
11.5%
V42370
9.3%
f42370
9.3%
d42370
9.3%
o28910
 
6.3%
N18616
 
4.1%
t18616
 
4.1%
S10294
 
2.3%
Other values (2)20588
 
4.5%
ValueCountFrequency (%)
28910
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII485482
100.0%

Most frequent character per block

ValueCountFrequency (%)
e95034
19.6%
i84740
17.5%
r52664
10.8%
V42370
8.7%
f42370
8.7%
d42370
8.7%
o28910
 
6.0%
28910
 
6.0%
N18616
 
3.8%
t18616
 
3.8%
Other values (3)30882
 
6.4%

target
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 331.1 KiB
1.0: 35969
0.0: 6401

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 127110
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0
Value   Count   Frequency (%)
1.0     35969   84.9%
0.0     6401    15.1%
Histogram of lengths of the category
ValueCountFrequency (%)
1.035969
84.9%
0.06401
 
15.1%

Most occurring characters

ValueCountFrequency (%)
048771
38.4%
.42370
33.3%
135969
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number84740
66.7%
Other Punctuation42370
33.3%

Most frequent character per category

ValueCountFrequency (%)
048771
57.6%
135969
42.4%
ValueCountFrequency (%)
.42370
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common127110
100.0%

Most frequent character per script

ValueCountFrequency (%)
048771
38.4%
.42370
33.3%
135969
28.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII127110
100.0%

Most frequent character per block

ValueCountFrequency (%)
048771
38.4%
.42370
33.3%
135969
28.3%

purpose
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.2 KiB
debt_consolidation: 19739
credit_card: 5452
other: 4375
home_improvement: 3187
major_purchase: 2304
Other values (9): 7313

Length

Max length: 18
Median length: 16
Mean length: 13.70261978
Min length: 3

Characters and Unicode

Total characters: 580580
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: credit_card
2nd row: car
3rd row: small_business
4th row: other
5th row: other
Value                Count   Frequency (%)
debt_consolidation   19739   46.6%
credit_card          5452    12.9%
other                4375    10.3%
home_improvement     3187    7.5%
major_purchase       2304    5.4%
small_business       1987    4.7%
car                  1610    3.8%
wedding              1003    2.4%
medical              751     1.8%
moving               624     1.5%
Other values (4)     1338    3.2%
Histogram of lengths of the category
ValueCountFrequency (%)
debt_consolidation19739
46.6%
credit_card5452
 
12.9%
other4375
 
10.3%
home_improvement3187
 
7.5%
major_purchase2304
 
5.4%
small_business1987
 
4.7%
car1610
 
3.8%
wedding1003
 
2.4%
medical751
 
1.8%
moving624
 
1.5%
Other values (4)1338
 
3.2%

Most occurring characters

ValueCountFrequency (%)
o74126
12.8%
d53550
9.2%
t53300
9.2%
i53290
9.2%
n47299
8.1%
e46537
 
8.0%
c36116
 
6.2%
a35869
 
6.2%
_32775
 
5.6%
s30415
 
5.2%
Other values (12)117303
20.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter547805
94.4%
Connector Punctuation32775
 
5.6%

Most frequent character per category

ValueCountFrequency (%)
o74126
13.5%
d53550
9.8%
t53300
9.7%
i53290
9.7%
n47299
8.6%
e46537
8.5%
c36116
 
6.6%
a35869
 
6.5%
s30415
 
5.6%
l24981
 
4.6%
Other values (11)92322
16.9%
ValueCountFrequency (%)
_32775
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin547805
94.4%
Common32775
 
5.6%

Most frequent character per script

ValueCountFrequency (%)
o74126
13.5%
d53550
9.8%
t53300
9.7%
i53290
9.7%
n47299
8.6%
e46537
8.5%
c36116
 
6.6%
a35869
 
6.5%
s30415
 
5.6%
l24981
 
4.6%
Other values (11)92322
16.9%
ValueCountFrequency (%)
_32775
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII580580
100.0%

Most frequent character per block

ValueCountFrequency (%)
o74126
12.8%
d53550
9.2%
t53300
9.2%
i53290
9.2%
n47299
8.1%
e46537
 
8.0%
c36116
 
6.2%
a35869
 
6.2%
_32775
 
5.6%
s30415
 
5.2%
Other values (12)117303
20.2%

zip_code
Categorical

HIGH CARDINALITY

Distinct: 835
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 121.7 KiB
100xx: 635
945xx: 559
606xx: 547
112xx: 537
070xx: 501
Other values (830): 39591

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 211850
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): 0.1%
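
"Distinct" and "Unique" measure different things: distinct counts how many different values occur at all, while unique counts the values that occur exactly once. A minimal sketch of both counts using `collections.Counter` follows; the function name and the six-value sample are hypothetical.

```python
from collections import Counter


def cardinality_summary(values):
    """Return distinct count, distinct share (%) and number of values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return {
        "distinct": distinct,
        "distinct_pct": 100 * distinct / len(values),
        "unique": unique,
    }


# Hypothetical sample of masked zip codes in the report's "NNNxx" format.
sample_zips = ["100xx", "945xx", "100xx", "606xx", "860xx", "945xx"]
summary = cardinality_summary(sample_zips)
print(summary)
```

For zip_code itself, 835 distinct values over 42370 rows is about 2.0%, which is what triggers the high-cardinality warning for a categorical variable.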

Sample

1st row: 860xx
2nd row: 309xx
3rd row: 606xx
4th row: 917xx
5th row: 972xx
Value                 Count   Frequency (%)
100xx                 635     1.5%
945xx                 559     1.3%
606xx                 547     1.3%
112xx                 537     1.3%
070xx                 501     1.2%
900xx                 477     1.1%
300xx                 433     1.0%
021xx                 413     1.0%
750xx                 392     0.9%
926xx                 387     0.9%
Other values (825)    37489   88.5%
Histogram of lengths of the category
ValueCountFrequency (%)
100xx635
 
1.5%
945xx559
 
1.3%
606xx547
 
1.3%
112xx537
 
1.3%
070xx501
 
1.2%
900xx477
 
1.1%
300xx433
 
1.0%
021xx413
 
1.0%
750xx392
 
0.9%
926xx387
 
0.9%
Other values (825)37489
88.5%

Most occurring characters

ValueCountFrequency (%)
x84740
40.0%
021148
 
10.0%
116614
 
7.8%
214375
 
6.8%
913388
 
6.3%
313285
 
6.3%
710941
 
5.2%
49762
 
4.6%
59624
 
4.5%
89282
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number127110
60.0%
Lowercase Letter84740
40.0%

Most frequent character per category

ValueCountFrequency (%)
021148
16.6%
116614
13.1%
214375
11.3%
913388
10.5%
313285
10.5%
710941
8.6%
49762
7.7%
59624
7.6%
89282
7.3%
68691
6.8%
ValueCountFrequency (%)
x84740
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common127110
60.0%
Latin84740
40.0%

Most frequent character per script

ValueCountFrequency (%)
021148
16.6%
116614
13.1%
214375
11.3%
913388
10.5%
313285
10.5%
710941
8.6%
49762
7.7%
59624
7.6%
89282
7.3%
68691
6.8%
ValueCountFrequency (%)
x84740
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII211850
100.0%

Most frequent character per block

ValueCountFrequency (%)
x84740
40.0%
021148
 
10.0%
116614
 
7.8%
214375
 
6.8%
913388
 
6.3%
313285
 
6.3%
710941
 
5.2%
49762
 
4.6%
59624
 
4.5%
89282
 
4.4%

state
Categorical

Distinct: 50
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.9 KiB
CA: 7421
NY: 4045
FL: 3087
TX: 2908
NJ: 1978
Other values (45): 22931

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 84740
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AZ
2nd row: GA
3rd row: IL
4th row: CA
5th row: OR
Value                Count   Frequency (%)
CA                   7421    17.5%
NY                   4045    9.5%
FL                   3087    7.3%
TX                   2908    6.9%
NJ                   1978    4.7%
IL                   1670    3.9%
PA                   1645    3.9%
GA                   1495    3.5%
VA                   1484    3.5%
MA                   1417    3.3%
Other values (40)    15220   35.9%
Histogram of lengths of the category
ValueCountFrequency (%)
ca7421
17.5%
ny4045
 
9.5%
fl3087
 
7.3%
tx2908
 
6.9%
nj1978
 
4.7%
il1670
 
3.9%
pa1645
 
3.9%
ga1495
 
3.5%
va1484
 
3.5%
ma1417
 
3.3%
Other values (40)15220
35.9%

Most occurring characters

ValueCountFrequency (%)
A16579
19.6%
C10619
12.5%
N8474
10.0%
L5700
 
6.7%
M5073
 
6.0%
Y4489
 
5.3%
T4181
 
4.9%
O3722
 
4.4%
I3398
 
4.0%
F3087
 
3.6%
Other values (14)19418
22.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter84740
100.0%

Most frequent character per category

ValueCountFrequency (%)
A16579
19.6%
C10619
12.5%
N8474
10.0%
L5700
 
6.7%
M5073
 
6.0%
Y4489
 
5.3%
T4181
 
4.9%
O3722
 
4.4%
I3398
 
4.0%
F3087
 
3.6%
Other values (14)19418
22.9%

Most occurring scripts

ValueCountFrequency (%)
Latin84740
100.0%

Most frequent character per script

ValueCountFrequency (%)
A16579
19.6%
C10619
12.5%
N8474
10.0%
L5700
 
6.7%
M5073
 
6.0%
Y4489
 
5.3%
T4181
 
4.9%
O3722
 
4.4%
I3398
 
4.0%
F3087
 
3.6%
Other values (14)19418
22.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII84740
100.0%

Most frequent character per block

ValueCountFrequency (%)
A16579
19.6%
C10619
12.5%
N8474
10.0%
L5700
 
6.7%
M5073
 
6.0%
Y4489
 
5.3%
T4181
 
4.9%
O3722
 
4.4%
I3398
 
4.0%
F3087
 
3.6%
Other values (14)19418
22.9%

dti_ratio
Real number (ℝ≥0)

Distinct: 2894
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.3886684
Minimum: 0
Maximum: 29.99
Zeros: 197
Zeros (%): 0.5%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.11
Q1: 8.21
median: 13.49
Q3: 18.7
95-th percentile: 23.92
Maximum: 29.99
Range: 29.99
Interquartile range (IQR): 10.49

Descriptive statistics

Standard deviation: 6.723157255
Coefficient of variation (CV): 0.5021527948
Kurtosis: -0.8515042195
Mean: 13.3886684
Median Absolute Deviation (MAD): 5.24
Skewness: -0.03080037214
Sum: 567277.88
Variance: 45.20084347
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
0                      197     0.5%
12                     54      0.1%
18                     46      0.1%
19.2                   45      0.1%
13.2                   43      0.1%
16.8                   41      0.1%
13.5                   41      0.1%
12.48                  40      0.1%
14.29                  37      0.1%
4.8                    37      0.1%
Other values (2884)    41789   98.6%

Value   Count   Frequency (%)
0       197     0.5%
0.01    3       < 0.1%
0.02    5       < 0.1%
0.03    2       < 0.1%
0.04    3       < 0.1%

Value   Count   Frequency (%)
29.99   1       < 0.1%
29.96   1       < 0.1%
29.95   2       < 0.1%
29.93   3       < 0.1%
29.92   2       < 0.1%

delinq_2yrs
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.6 KiB
no: 37652
yes: 4718

Length

Max length: 3
Median length: 2
Mean length: 2.111352372
Min length: 2

Characters and Unicode

Total characters: 89458
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: no
2nd row: no
3rd row: no
4th row: no
5th row: no
Value   Count   Frequency (%)
no      37652   88.9%
yes     4718    11.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no37652
88.9%
yes4718
 
11.1%

Most occurring characters

ValueCountFrequency (%)
n37652
42.1%
o37652
42.1%
y4718
 
5.3%
e4718
 
5.3%
s4718
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter89458
100.0%

Most frequent character per category

ValueCountFrequency (%)
n37652
42.1%
o37652
42.1%
y4718
 
5.3%
e4718
 
5.3%
s4718
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Latin89458
100.0%

Most frequent character per script

ValueCountFrequency (%)
n37652
42.1%
o37652
42.1%
y4718
 
5.3%
e4718
 
5.3%
s4718
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII89458
100.0%

Most frequent character per block

ValueCountFrequency (%)
n37652
42.1%
o37652
42.1%
y4718
 
5.3%
e4718
 
5.3%
s4718
 
5.3%

fico
Real number (ℝ≥0)

Distinct: 44
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 713.0837857
Minimum: 610
Maximum: 825
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 610
5-th percentile: 665
Q1: 685
median: 710
Q3: 740
95-th percentile: 780
Maximum: 825
Range: 215
Interquartile range (IQR): 55

Descriptive statistics

Standard deviation: 36.14930141
Coefficient of variation (CV): 0.05069432532
Kurtosis: -0.498679629
Mean: 713.0837857
Median Absolute Deviation (MAD): 25
Skewness: 0.466343402
Sum: 30213360
Variance: 1306.771993
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)
Value                Count   Frequency (%)
685                  2303    5.4%
700                  2262    5.3%
680                  2214    5.2%
695                  2194    5.2%
690                  2189    5.2%
675                  1990    4.7%
705                  1964    4.6%
720                  1943    4.6%
725                  1890    4.5%
715                  1884    4.4%
Other values (34)    21537   50.8%

Value   Count   Frequency (%)
610     1       < 0.1%
615     1       < 0.1%
620     1       < 0.1%
625     1       < 0.1%
630     4       < 0.1%

Value   Count   Frequency (%)
825     3       < 0.1%
820     19      < 0.1%
815     28      0.1%
810     124     0.3%
805     193     0.5%

inq_last_6mths
Real number (ℝ≥0)

ZEROS

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.079631815
Minimum: 0
Maximum: 33
Zeros: 19599
Zeros (%): 46.3%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 4
Maximum: 33
Range: 33
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.521101951
Coefficient of variation (CV): 1.408908046
Kurtosis: 30.93928581
Mean: 1.079631815
Median Absolute Deviation (MAD): 1
Skewness: 3.430891172
Sum: 45744
Variance: 2.313751145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)
Value                Count   Frequency (%)
0                    19599   46.3%
1                    11209   26.5%
2                    5973    14.1%
3                    3173    7.5%
4                    1051    2.5%
5                    595     1.4%
6                    335     0.8%
7                    181     0.4%
8                    115     0.3%
9                    47      0.1%
Other values (18)    92      0.2%

Value   Count   Frequency (%)
0       19599   46.3%
1       11209   26.5%
2       5973    14.1%
3       3173    7.5%
4       1051    2.5%

Value   Count   Frequency (%)
33      1       < 0.1%
32      1       < 0.1%
31      1       < 0.1%
28      1       < 0.1%
27      1       < 0.1%

delinq_ago_mths
Real number (ℝ≥0)

ZEROS

Distinct: 95
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.87307057
Minimum: 0
Maximum: 120
Zeros: 27618
Zeros (%): 65.2%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 21
95-th percentile: 64
Maximum: 120
Range: 120
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 21.67389845
Coefficient of variation (CV): 1.6836619
Kurtosis: 1.346611977
Mean: 12.87307057
Median Absolute Deviation (MAD): 0
Skewness: 1.587985632
Sum: 545432
Variance: 469.7578742
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 27618 | 65.2%
30 | 270 | 0.6%
23 | 266 | 0.6%
19 | 264 | 0.6%
15 | 262 | 0.6%
24 | 261 | 0.6%
18 | 252 | 0.6%
38 | 251 | 0.6%
20 | 249 | 0.6%
22 | 248 | 0.6%
Other values (85) | 12429 | 29.3%

Value | Count | Frequency (%)
0 | 27618 | 65.2%
1 | 31 | 0.1%
2 | 114 | 0.3%
3 | 157 | 0.4%
4 | 162 | 0.4%

Value | Count | Frequency (%)
120 | 1 | < 0.1%
115 | 1 | < 0.1%
107 | 1 | < 0.1%
106 | 1 | < 0.1%
103 | 2 | < 0.1%

active_lines
Real number (ℝ≥0)

Distinct: 44
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.35206514
Minimum: 1
Maximum: 47
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 6
median: 9
Q3: 12
95-th percentile: 18
Maximum: 47
Range: 46
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.492722011
Coefficient of variation (CV): 0.4803989219
Kurtosis: 1.941316969
Mean: 9.35206514
Median Absolute Deviation (MAD): 3
Skewness: 1.044303957
Sum: 396247
Variance: 20.18455107
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)
Value | Count | Frequency (%)
7 | 4243 | 10.0%
8 | 4166 | 9.8%
6 | 4163 | 9.8%
9 | 3917 | 9.2%
10 | 3377 | 8.0%
5 | 3357 | 7.9%
11 | 2940 | 6.9%
4 | 2498 | 5.9%
12 | 2392 | 5.6%
13 | 2058 | 4.9%
Other values (34) | 9259 | 21.9%

Value | Count | Frequency (%)
1 | 34 | 0.1%
2 | 669 | 1.6%
3 | 1590 | 3.8%
4 | 2498 | 5.9%
5 | 3357 | 7.9%

Value | Count | Frequency (%)
47 | 1 | < 0.1%
46 | 1 | < 0.1%
44 | 1 | < 0.1%
42 | 1 | < 0.1%
41 | 1 | < 0.1%

pub_rec
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.6 KiB

no: 40001
yes: 2369

Length

Max length: 3
Median length: 2
Mean length: 2.055912202
Min length: 2

Characters and Unicode

Total characters: 87109
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: no
2nd row: no
3rd row: no
4th row: no
5th row: no
Value | Count | Frequency (%)
no | 40001 | 94.4%
yes | 2369 | 5.6%

Histogram of lengths of the category

Value | Count | Frequency (%)
no | 40001 | 94.4%
yes | 2369 | 5.6%

Most occurring characters

Value | Count | Frequency (%)
n | 40001 | 45.9%
o | 40001 | 45.9%
y | 2369 | 2.7%
e | 2369 | 2.7%
s | 2369 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 87109 | 100.0%

Most frequent character per category

Value | Count | Frequency (%)
n | 40001 | 45.9%
o | 40001 | 45.9%
y | 2369 | 2.7%
e | 2369 | 2.7%
s | 2369 | 2.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 87109 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
n | 40001 | 45.9%
o | 40001 | 45.9%
y | 2369 | 2.7%
e | 2369 | 2.7%
s | 2369 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 87109 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
n | 40001 | 45.9%
o | 40001 | 45.9%
y | 2369 | 2.7%
e | 2369 | 2.7%
s | 2369 | 2.7%

revol_util
Real number (ℝ≥0)

ZEROS

Distinct: 1118
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.12923696
Minimum: 0
Maximum: 119
Zeros: 1068
Zeros (%): 2.5%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.7
Q1: 25.7
median: 49.7
Q3: 72.7
95-th percentile: 93.7
Maximum: 119
Range: 119
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 28.36336415
Coefficient of variation (CV): 0.5773214872
Kurtosis: -1.101430397
Mean: 49.12923696
Median Absolute Deviation (MAD): 23.5
Skewness: -0.04415707083
Sum: 2081605.77
Variance: 804.4804258
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1068 | 2.5%
40.7 | 65 | 0.2%
0.2 | 64 | 0.2%
63 | 63 | 0.1%
66.6 | 62 | 0.1%
70.4 | 61 | 0.1%
0.1 | 61 | 0.1%
64.6 | 60 | 0.1%
66.7 | 59 | 0.1%
46.4 | 59 | 0.1%
Other values (1108) | 40748 | 96.2%

Value | Count | Frequency (%)
0 | 1068 | 2.5%
0.01 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.04 | 1 | < 0.1%
0.05 | 1 | < 0.1%

Value | Count | Frequency (%)
119 | 1 | < 0.1%
108.8 | 1 | < 0.1%
106.5 | 1 | < 0.1%
106.4 | 1 | < 0.1%
106.2 | 1 | < 0.1%

total_lines
Real number (ℝ≥0)

Distinct: 83
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.15107387
Minimum: 1
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7
Q1: 13
median: 20
Q3: 29
95-th percentile: 44
Maximum: 90
Range: 89
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 11.58286289
Coefficient of variation (CV): 0.5229029959
Kurtosis: 0.6621439852
Mean: 22.15107387
Median Absolute Deviation (MAD): 8
Skewness: 0.8236095142
Sum: 938541
Variance: 134.1627127
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
15 | 1551 | 3.7%
17 | 1543 | 3.6%
16 | 1540 | 3.6%
14 | 1527 | 3.6%
20 | 1502 | 3.5%
18 | 1491 | 3.5%
21 | 1481 | 3.5%
13 | 1476 | 3.5%
12 | 1412 | 3.3%
19 | 1402 | 3.3%
Other values (73) | 27445 | 64.8%

Value | Count | Frequency (%)
1 | 20 | < 0.1%
2 | 36 | 0.1%
3 | 228 | 0.5%
4 | 474 | 1.1%
5 | 611 | 1.4%

Value | Count | Frequency (%)
90 | 1 | < 0.1%
87 | 1 | < 0.1%
81 | 1 | < 0.1%
80 | 1 | < 0.1%
79 | 2 | < 0.1%

times_bankrupted
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.6 KiB

no: 40517
yes: 1853

Length

Max length: 3
Median length: 2
Mean length: 2.043733774
Min length: 2

Characters and Unicode

Total characters: 86593
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: no
2nd row: no
3rd row: no
4th row: no
5th row: no
Value | Count | Frequency (%)
no | 40517 | 95.6%
yes | 1853 | 4.4%

Histogram of lengths of the category

Value | Count | Frequency (%)
no | 40517 | 95.6%
yes | 1853 | 4.4%

Most occurring characters

Value | Count | Frequency (%)
n | 40517 | 46.8%
o | 40517 | 46.8%
y | 1853 | 2.1%
e | 1853 | 2.1%
s | 1853 | 2.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 86593 | 100.0%

Most frequent character per category

Value | Count | Frequency (%)
n | 40517 | 46.8%
o | 40517 | 46.8%
y | 1853 | 2.1%
e | 1853 | 2.1%
s | 1853 | 2.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 86593 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
n | 40517 | 46.8%
o | 40517 | 46.8%
y | 1853 | 2.1%
e | 1853 | 2.1%
s | 1853 | 2.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 86593 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
n | 40517 | 46.8%
o | 40517 | 46.8%
y | 1853 | 2.1%
e | 1853 | 2.1%
s | 1853 | 2.1%

tax_liens
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 41.6 KiB

no: 42369
yes: 1

Length

Max length: 3
Median length: 2
Mean length: 2.000023602
Min length: 2

Characters and Unicode

Total characters: 84741
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: no
2nd row: no
3rd row: no
4th row: no
5th row: no
Value | Count | Frequency (%)
no | 42369 | > 99.9%
yes | 1 | < 0.1%

Histogram of lengths of the category

Value | Count | Frequency (%)
no | 42369 | > 99.9%
yes | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
n | 42369 | 50.0%
o | 42369 | 50.0%
y | 1 | < 0.1%
e | 1 | < 0.1%
s | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 84741 | 100.0%

Most frequent character per category

Value | Count | Frequency (%)
n | 42369 | 50.0%
o | 42369 | 50.0%
y | 1 | < 0.1%
e | 1 | < 0.1%
s | 1 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 84741 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
n | 42369 | 50.0%
o | 42369 | 50.0%
y | 1 | < 0.1%
e | 1 | < 0.1%
s | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 84741 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
n | 42369 | 50.0%
o | 42369 | 50.0%
y | 1 | < 0.1%
e | 1 | < 0.1%
s | 1 | < 0.1%

credit_hist
Real number (ℝ≥0)

Distinct: 552
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 163.6889544
Minimum: 6
Maximum: 784
Zeros: 0
Zeros (%): 0.0%
Memory size: 331.1 KiB

Quantile statistics

Minimum: 6
5-th percentile: 55
Q1: 107
median: 149
Q3: 203
95-th percentile: 321.55
Maximum: 784
Range: 778
Interquartile range (IQR): 96

Descriptive statistics

Standard deviation: 82.6533804
Coefficient of variation (CV): 0.504941709
Kurtosis: 2.0004277
Mean: 163.6889544
Median Absolute Deviation (MAD): 47
Skewness: 1.145237095
Sum: 6935501
Variance: 6831.581291
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
144 | 371 | 0.9%
119 | 361 | 0.9%
141 | 352 | 0.8%
155 | 344 | 0.8%
146 | 340 | 0.8%
139 | 339 | 0.8%
132 | 337 | 0.8%
130 | 336 | 0.8%
125 | 334 | 0.8%
123 | 334 | 0.8%
Other values (542) | 38922 | 91.9%

Value | Count | Frequency (%)
6 | 4 | < 0.1%
7 | 2 | < 0.1%
8 | 6 | < 0.1%
9 | 6 | < 0.1%
10 | 6 | < 0.1%

Value | Count | Frequency (%)
784 | 1 | < 0.1%
724 | 1 | < 0.1%
684 | 1 | < 0.1%
672 | 1 | < 0.1%
656 | 1 | < 0.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line affects only the sign of r, not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
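
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the report's own implementation: the function name `pearson_r` is made up for this example, and the inputs are the `loan` and `installment` values of the first five sample rows of this report. The second call shifts and rescales `loan` to illustrate the invariance under changes of location and scale.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and of both standard deviations cancel,
    so plain sums of deviations are enough.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Values taken from the first five sample rows of this report.
loan = [5000.0, 2500.0, 2400.0, 10000.0, 3000.0]
installment = [162.87, 59.83, 84.33, 339.31, 67.79]

r = pearson_r(loan, installment)

# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([v / 1000 + 7 for v in loan], installment)

print(f"r = {r:.4f}, r after rescaling = {r_rescaled:.4f}")
```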

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
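
As a rough sketch of that recipe, the snippet below ranks both variables (tied values share their average rank) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and the toy data is invented purely to show a monotonic but nonlinear relationship; it is not taken from the dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"Spearman rho = {rho:.4f}, Pearson r = {r:.4f}")
```

On data like this, ρ reaches its maximum because the ordering of the two variables agrees perfectly, while r stays below 1 because the relationship is not a straight line.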

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
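
The pair counting described above can be sketched as follows. This minimal version implements the definition exactly as stated in the text (often called tau-a), where tied pairs count as neither concordant nor discordant; statistical libraries commonly default to a tie-adjusted variant, so their results may differ when ties are present. The function name `kendall_tau_a` is made up for this example, and the inputs are the `fico` and `int_rate` values of the first five sample rows of this report.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 is a tie in x or y and counts towards neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Values taken from the first five sample rows of this report.
fico = [735.0, 740.0, 735.0, 690.0, 695.0]
int_rate = [10.65, 15.27, 15.96, 13.49, 12.69]

tau = kendall_tau_a(fico, int_rate)
print(f"tau = {tau:.2f}")
```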

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
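
A bias-corrected Cramér's V can be sketched in plain Python as shown below. This follows the commonly cited form of Bergsma's correction and is offered as an illustration under that assumption; it is not necessarily identical to the code that produced this report. The function name `cramers_v_corrected` is hypothetical, and the inputs are the `pub_rec` and `times_bankrupted` values of the last ten sample rows of this report.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)
    if r < 2 or k < 2:
        raise ValueError("each variable needs at least two distinct levels")

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_total in row_totals.items():
        for col_label, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        raise ValueError("too few observations for the corrected estimate")
    return math.sqrt(phi2_corr / denominator)


# Values taken from the last ten sample rows of this report.
pub_rec = ["no", "no", "no", "no", "no", "no", "yes", "no", "yes", "no"]
times_bankrupted = ["no", "no", "no", "no", "no", "no", "yes", "no", "no", "no"]

v = cramers_v_corrected(pub_rec, times_bankrupted)
print(f"corrected Cramér's V = {v:.3f}")
```

Ten rows are far too few for a meaningful estimate; the sample only keeps the example self-contained.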

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

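(index) | df_index | loan | term | int_rate | installment | grade | empl_years | home_ownership | annual_inc | income_ver | target | purpose | zip_code | state | dti_ratio | delinq_2yrs | fico | inq_last_6mths | delinq_ago_mths | active_lines | pub_rec | revol_util | total_lines | times_bankrupted | tax_liens | credit_hist
0 | 0 | 5000.0 | 36 months | 10.65 | 162.87 | B | 10+ years | RENT | 24000.0 | Verified | 1.0 | credit_card | 860xx | AZ | 27.65 | no | 735.0 | 1.0 | 0.0 | 3.0 | no | 83.7 | 9.0 | no | no | 322
1 | 1 | 2500.0 | 60 months | 15.27 | 59.83 | C | < 1 year | RENT | 30000.0 | Source Verified | 0.0 | car | 309xx | GA | 1.00 | no | 740.0 | 5.0 | 0.0 | 3.0 | no | 9.4 | 4.0 | no | no | 152
2 | 2 | 2400.0 | 36 months | 15.96 | 84.33 | C | 10+ years | RENT | 12252.0 | Not Verified | 1.0 | small_business | 606xx | IL | 8.72 | no | 735.0 | 2.0 | 0.0 | 2.0 | no | 98.5 | 10.0 | no | no | 120
3 | 3 | 10000.0 | 36 months | 13.49 | 339.31 | C | 10+ years | RENT | 49200.0 | Source Verified | 1.0 | other | 917xx | CA | 20.00 | no | 690.0 | 1.0 | 35.0 | 10.0 | no | 21.0 | 37.0 | no | no | 189
4 | 4 | 3000.0 | 60 months | 12.69 | 67.79 | B | 1 year | RENT | 80000.0 | Source Verified | 1.0 | other | 972xx | OR | 17.94 | no | 695.0 | 0.0 | 38.0 | 15.0 | no | 53.9 | 38.0 | no | no | 190
5 | 5 | 5000.0 | 36 months | 7.90 | 156.46 | A | 3 years | RENT | 36000.0 | Source Verified | 1.0 | wedding | 852xx | AZ | 11.20 | no | 730.0 | 3.0 | 0.0 | 9.0 | no | 28.3 | 12.0 | no | no | 84
6 | 6 | 7000.0 | 60 months | 15.96 | 170.08 | C | 8 years | RENT | 47004.0 | Not Verified | 1.0 | debt_consolidation | 280xx | NC | 23.51 | no | 690.0 | 1.0 | 0.0 | 7.0 | no | 85.6 | 11.0 | no | no | 77
7 | 7 | 3000.0 | 36 months | 18.64 | 109.43 | E | 9 years | RENT | 48000.0 | Source Verified | 1.0 | car | 900xx | CA | 5.35 | no | 660.0 | 2.0 | 0.0 | 4.0 | no | 87.5 | 4.0 | no | no | 58
8 | 8 | 5600.0 | 60 months | 21.28 | 152.39 | F | 4 years | OWN | 40000.0 | Source Verified | 0.0 | small_business | 958xx | CA | 5.55 | no | 675.0 | 2.0 | 0.0 | 11.0 | no | 32.6 | 13.0 | no | no | 91
9 | 9 | 5375.0 | 60 months | 12.69 | 121.45 | B | < 1 year | RENT | 15000.0 | Verified | 0.0 | other | 774xx | TX | 18.08 | no | 725.0 | 0.0 | 0.0 | 2.0 | no | 36.5 | 3.0 | no | no | 86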

Last rows

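(index) | df_index | loan | term | int_rate | installment | grade | empl_years | home_ownership | annual_inc | income_ver | target | purpose | zip_code | state | dti_ratio | delinq_2yrs | fico | inq_last_6mths | delinq_ago_mths | active_lines | pub_rec | revol_util | total_lines | times_bankrupted | tax_liens | credit_hist
42360 | 42467 | 19000.0 | 36 months | 16.28 | 670.59 | F | < 1 year | MORTGAGE | 100000.0 | Not Verified | 0.0 | other | 300xx | GA | 9.79 | no | 660.0 | 1.0 | 0.0 | 5.0 | no | 64.1 | 6.0 | no | no | 16
42361 | 42468 | 12000.0 | 36 months | 14.38 | 412.37 | E | 1 year | RENT | 15000.0 | Not Verified | 1.0 | educational | 088xx | NJ | 6.48 | no | 670.0 | 12.0 | 24.0 | 4.0 | no | 83.7 | 4.0 | no | no | 37
42362 | 42469 | 13000.0 | 36 months | 17.22 | 464.95 | G | 1 year | RENT | 32000.0 | Not Verified | 0.0 | other | 333xx | FL | 15.98 | no | 640.0 | 1.0 | 0.0 | 6.0 | no | 79.8 | 7.0 | no | no | 27
42363 | 42470 | 8000.0 | 36 months | 16.91 | 284.86 | G | 10+ years | OWN | 60000.0 | Not Verified | 0.0 | credit_card | 334xx | FL | 20.22 | no | 645.0 | 1.0 | 33.0 | 5.0 | no | 99.3 | 23.0 | no | no | 205
42364 | 42471 | 10000.0 | 36 months | 15.01 | 346.73 | F | < 1 year | RENT | 50000.0 | Not Verified | 0.0 | credit_card | 127xx | NY | 7.75 | no | 645.0 | 1.0 | 0.0 | 10.0 | no | 83.3 | 12.0 | no | no | 50
42365 | 42472 | 8000.0 | 36 months | 9.96 | 257.99 | B | 3 years | OWN | 25000.0 | Not Verified | 1.0 | car | 069xx | CT | 0.48 | no | 700.0 | 1.0 | 0.0 | 3.0 | no | 19.2 | 3.0 | no | no | 20
42366 | 42474 | 13000.0 | 36 months | 10.91 | 425.04 | C | 2 years | MORTGAGE | 62000.0 | Not Verified | 1.0 | debt_consolidation | 282xx | NC | 20.00 | no | 695.0 | 4.0 | 42.0 | 23.0 | yes | 50.2 | 53.0 | yes | no | 164
42367 | 42476 | 10500.0 | 36 months | 15.33 | 365.69 | F | < 1 year | RENT | 62000.0 | Not Verified | 0.0 | car | 207xx | MD | 1.72 | no | 640.0 | 13.0 | 43.0 | 3.0 | no | 95.8 | 7.0 | no | no | 49
42368 | 42477 | 2000.0 | 36 months | 13.43 | 67.81 | E | 7 years | RENT | 45000.0 | Not Verified | 1.0 | credit_card | 280xx | NC | 8.75 | no | 645.0 | 8.0 | 0.0 | 7.0 | yes | 54.1 | 7.0 | no | yes | 15
42369 | 42478 | 3000.0 | 36 months | 13.75 | 102.17 | E | < 1 year | RENT | 12000.0 | Not Verified | 0.0 | debt_consolidation | 321xx | FL | 19.30 | yes | 665.0 | 4.0 | 9.0 | 12.0 | no | 14.4 | 14.0 | no | no | 186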